16 research outputs found
Improving Neural Topic Models with Wasserstein Knowledge Distillation
Topic modeling is a dominant method for exploring document collections on the
web and in digital libraries. Recent approaches to topic modeling use
pretrained contextualized language models and variational autoencoders.
However, large neural topic models have a considerable memory footprint. In
this paper, we propose a knowledge distillation framework to compress a
contextualized topic model without loss in topic quality. In particular, the
proposed distillation objective is to minimize the cross-entropy of the soft
labels produced by the teacher and the student models, as well as to minimize
the squared 2-Wasserstein distance between the latent distributions learned by
the two models. Experiments on two publicly available datasets show that the
student trained with knowledge distillation achieves topic coherence much
higher than that of the original student model, and even surpasses the teacher
while containing far fewer parameters than the teacher. The distilled model
also outperforms several other competitive topic models on topic coherence.
Comment: Accepted at ECIR 202
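The two terms of the distillation objective described above can be illustrated with a small sketch. This is not the paper's code: the function names are ours, and we assume diagonal Gaussian latent distributions (standard in VAE-based topic models), for which the squared 2-Wasserstein distance has a closed form.

```python
import numpy as np

def soft_label_cross_entropy(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy between the teacher's and student's soft topic labels."""
    return -np.sum(teacher_probs * np.log(student_probs + eps))

def squared_w2_diag_gaussians(mu_t, var_t, mu_s, var_s):
    """Squared 2-Wasserstein distance between two diagonal Gaussians.

    For N(mu_t, diag(var_t)) and N(mu_s, diag(var_s)) this reduces to
    ||mu_t - mu_s||^2 + sum((sqrt(var_t) - sqrt(var_s))^2).
    """
    return np.sum((mu_t - mu_s) ** 2) + np.sum((np.sqrt(var_t) - np.sqrt(var_s)) ** 2)

def distillation_loss(teacher_probs, student_probs,
                      mu_t, var_t, mu_s, var_s, alpha=1.0):
    """Combined objective: soft-label cross-entropy plus a weighted W2 term."""
    ce = soft_label_cross_entropy(teacher_probs, student_probs)
    w2 = squared_w2_diag_gaussians(mu_t, var_t, mu_s, var_s)
    return ce + alpha * w2
```

When the student matches the teacher exactly, the W2 term vanishes and the cross-entropy term equals the teacher's own entropy.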
Do Neural Topic Models Really Need Dropout? Analysis of the Effect of Dropout in Topic Modeling
Dropout is a widely used regularization technique for reducing overfitting in
large feedforward neural networks trained on small datasets, where such
networks otherwise perform poorly on the held-out test subset. Although the effectiveness of this
regularization trick has been extensively studied for convolutional neural
networks, there is a lack of analysis of it for unsupervised models and in
particular, VAE-based neural topic models. In this paper, we analyze the
consequences of dropout in the encoder as well as in the decoder of the VAE
architecture in three widely used neural topic models, namely, contextualized
topic model (CTM), ProdLDA, and embedded topic model (ETM) using four publicly
available datasets. We characterize the dropout effect on these models in terms
of the quality and predictive performance of the generated topics.
Comment: Accepted at EACL 202
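For readers unfamiliar with the mechanism under study, inverted dropout (the variant used in most modern frameworks) can be sketched as follows. This is an illustrative implementation of the regularizer itself, not code from the paper:

```python
import numpy as np

def dropout(x, rate, rng, train=True):
    """Inverted dropout: zero each unit with probability `rate` during
    training and rescale survivors so the expected activation is unchanged.
    At test time (train=False) the input passes through untouched."""
    if not train or rate == 0.0:
        return x
    keep = 1.0 - rate
    mask = rng.random(x.shape) < keep
    return x * mask / keep
```

In a VAE-based topic model, such a layer may sit in the encoder (on hidden activations before the latent parameters) or in the decoder (on the document-topic representation), and the paper examines the effect of each placement.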
Improving Contextualized Topic Models with Negative Sampling
Topic modeling has emerged as a dominant method for exploring large document
collections. Recent approaches to topic modeling use large contextualized
language models and variational autoencoders. In this paper, we propose a
negative sampling mechanism for a contextualized topic model to improve the
quality of the generated topics. In particular, during model training, we
perturb the generated document-topic vector and use a triplet loss to encourage
the document reconstructed from the correct document-topic vector to be similar
to the input document and dissimilar to the document reconstructed from the
perturbed vector. Experiments for different topic counts on three publicly
available benchmark datasets show that in most cases, our approach leads to an
increase in topic coherence over that of the baselines. Our model also achieves
very high topic diversity.
Comment: Accepted at the 19th International Conference on Natural Language Processing (ICON 2022)
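The perturbation-and-triplet scheme described above can be rendered as a small sketch. This is our own simplified illustration with hypothetical function names; in the paper, the anchor is the input document and the positive/negative are documents reconstructed from the correct and perturbed document-topic vectors inside the VAE.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Hinge-style triplet loss: push d(anchor, positive) + margin
    below d(anchor, negative), using squared Euclidean distance."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def perturb_topic_vector(theta, rng, scale=0.1):
    """Add Gaussian noise to a document-topic vector and renormalize
    so the result stays on the probability simplex."""
    noisy = np.clip(theta + rng.normal(0.0, scale, theta.shape), 1e-8, None)
    return noisy / noisy.sum()
```

During training, the loss is zero whenever the reconstruction from the perturbed vector is already far enough from the input; otherwise it penalizes the model.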
Segmenting Scientific Abstracts into Discourse Categories: A Deep Learning-Based Approach for Sparse Labeled Data
The abstract of a scientific paper distills the contents of the paper into a
short paragraph. In the biomedical literature, it is customary to structure an
abstract into discourse categories like BACKGROUND, OBJECTIVE, METHOD, RESULT,
and CONCLUSION, but this segmentation is uncommon in other fields like computer
science. Explicit categories could be helpful for more granular, that is,
discourse-level search and recommendation. The sparsity of labeled data makes
it challenging to construct supervised machine learning solutions for automatic
discourse-level segmentation of abstracts in non-bio domains. In this paper, we
address this problem using transfer learning. In particular, we define three
discourse categories (BACKGROUND, TECHNIQUE, and OBSERVATION) for an abstract,
because these three categories are the most common. We train a deep neural network on
structured abstracts from PubMed, then fine-tune it on a small hand-labeled
corpus of computer science papers. We observe an accuracy of 75% on the test
corpus. We perform an ablation study to highlight the roles of the different
parts of the model. Our method appears to be a promising solution for the
automatic segmentation of abstracts where labeled data is sparse.
Comment: to appear in the proceedings of JCDL'202
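The pretrain-then-fine-tune recipe described above can be shown in miniature. The sketch below is our own toy example (logistic regression on synthetic data, not the paper's deep network): weights learned on a large source set are reused as the starting point for training on a small target set.

```python
import numpy as np

def train_logreg(X, y, w=None, lr=0.1, steps=200):
    """Gradient-descent logistic regression; pass `w` to fine-tune
    existing weights instead of starting from zero."""
    if w is None:
        w = np.zeros(X.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-X @ w))
        w -= lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(0)
# Large labeled "source" set (cf. structured PubMed abstracts).
X_src = rng.normal(size=(500, 4))
y_src = (X_src[:, 0] > 0).astype(float)
# Small labeled "target" set (cf. hand-labeled CS abstracts).
X_tgt = rng.normal(size=(20, 4))
y_tgt = (X_tgt[:, 0] > 0).astype(float)

w0 = train_logreg(X_src, y_src)                                   # pretraining
w1 = train_logreg(X_tgt, y_tgt, w=w0.copy(), lr=0.01, steps=50)   # fine-tuning
```

The fine-tuning stage uses a smaller learning rate and fewer steps, so the pretrained weights are adjusted rather than overwritten, which is the point of transfer learning when target labels are scarce.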
Generation of Highlights from Research Papers Using Pointer-Generator Networks and SciBERT Embeddings
Nowadays many research articles are prefaced with research highlights to
summarize the main findings of the paper. Highlights not only help researchers
identify the contributions of a paper precisely and quickly, but also enhance
the discoverability of the article via search engines. We aim to automatically
construct research highlights given certain segments of the research paper. We
use a pointer-generator network with coverage mechanism and a contextual
embedding layer at the input that encodes the input tokens into SciBERT
embeddings. We test our model on a benchmark dataset, CSPubSum, and also present
MixSub, a new multi-disciplinary corpus of papers for automatic research
highlight generation. For both CSPubSum and MixSub, we have observed that the
proposed model achieves the best performance compared to related variants and
other models proposed in the literature. On the CSPubSum data set, our model
achieves the best performance when the input is only the abstract of a paper as
opposed to other segments of the paper. It produces ROUGE-1, ROUGE-2 and
ROUGE-L F1-scores of 38.26, 14.26 and 35.51, respectively, METEOR F1-score of
32.62, and BERTScore F1 of 86.65 which outperform all other baselines. On the
new MixSub data set, where only the abstract is the input, our proposed model
(when trained on the whole training corpus without distinguishing between the
subject categories) achieves ROUGE-1, ROUGE-2 and ROUGE-L F1-scores of 31.78,
9.76 and 29.3, respectively, METEOR F1-score of 24.00, and BERTScore F1 of
85.25, outperforming other models.
Comment: 18 pages, 7 figures, 7 tables
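The ROUGE-N F1 scores reported above measure n-gram overlap between a generated highlight and the reference. A minimal, unsmoothed sketch of the metric follows (our own illustration; the published results use the standard ROUGE toolkit):

```python
from collections import Counter

def rouge_n_f1(candidate, reference, n=1):
    """ROUGE-N F1: harmonic mean of n-gram precision and recall
    between a candidate summary and a single reference."""
    def ngrams(tokens, n):
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

ROUGE-L, also reported above, instead scores the longest common subsequence, rewarding in-order matches that need not be contiguous.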
An overview of device-to-device communication in cellular networks
Device-to-device (D2D) communication is expected to play a significant role in upcoming cellular networks as it promises ultra-low latency for communication among users. This new mode may operate in licensed or unlicensed spectrum. It is a novel addition to the traditional cellular communication paradigm. Its benefits are, however, accompanied by many technical and business issues that must be resolved before integrating it into the cellular ecosystem. This paper discusses the main characteristics of D2D communication, including its usage scenarios, architecture, technical features, and areas of active research.
Keywords: Device-to-device communication (D2D), Cellular network, 5G, Resource management, LTE Direct
Dimensionality reduction of EEG signal using Fuzzy Discernibility Matrix.
2017 10th International Conference on Human System Interactions (HSI), 131-13
Discernibility matrix based dimensionality reduction for EEG signal
TENCON 2016 - 2016 IEEE Region 10 Conference. DOI: 10.1109/tencon.2016.7848530